(57) ABSTRACT

To make it possible to accurately track a prescribed part, a 
low-resolution overall image is taken in from an input 
image, and a medium-resolution image, whose center is the 
head part, is extracted from it. When it is possible to extract 
a medium-resolution image in the middle of which is the 
head part, an even higher-resolution image is extracted, in 
the middle of which is the two-eyes part. 
IMAGE PROCESSING DEVICE AND
METHOD, AND DISTRIBUTION MEDIUM

FIELD OF THE INVENTION

This invention relates to an image processing device and
method, and distribution medium. More specifically, it
relates to an image processing device and method, and
distribution medium, in which changes in a prescribed
image can be continuously extracted and tracked.

BACKGROUND OF THE INVENTION

The recent proliferation of computer entertainment
devices has made it possible to enjoy games in every home.
In these computer entertainment devices, the game objects
(characters) are usually made to move in arbitrary directions
and at arbitrary speeds by users who manipulate buttons or
joysticks.

Thus a conventional device is made so that various
commands are input by manipulating buttons or joysticks.
This amounts to nothing more than mirroring button and
joystick manipulation techniques in the game. The problem
has been that it is impossible to enjoy games that have more
abundant changes.

The present invention, which was devised with this
situation in mind, is intended to make it possible to enjoy
games that have more abundant changes.

The image processing device has a first extraction means
that extracts the image of a prescribed part from an input
image; a second extraction means that extracts a part of the
prescribed part extracted by the first extraction means as a
higher-resolution image; and a tracking means that tracks
the image of the prescribed part so that the second extraction
means can continuously extract the image of the prescribed
part from the image extracted by the first extraction means.

The image processing method also includes a first
extraction step that extracts the image of a prescribed part
from an input image; a second extraction step that extracts a
part of the prescribed part extracted in the first extraction
step as a higher-resolution image; and a tracking step that
tracks the image of the prescribed part so that in the second
extraction step the image of the prescribed part can be
continuously extracted from the image extracted in the first
extraction step.

The distribution medium provides a program that causes
processing to be executed on an image processing device
and includes a first extraction step that extracts the image of
a prescribed part from an input image; a second extraction
step that extracts a part of the prescribed part extracted in the
first extraction step as a higher-resolution image; and a
tracking step that tracks the image of the prescribed part so
that in the second extraction step the image of the prescribed
part can be continuously extracted from the image extracted
in the first extraction step.

A higher-resolution image is extracted so that the image
of the prescribed part can be continuously tracked from the
input image.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram showing an example of the
composition of an image processing system to which this
invention is applied.

FIG. 2 is a block diagram showing an example of the
composition of the image processing device of FIG. 1.

FIG. 3 is a flowchart that explains the expression data
acquisition processing of the image processing device of
FIG. 2.

FIG. 4 is a flowchart that explains the active image
acquisition processing in step S1 of FIG. 3.

FIG. 5 is a diagram that explains the active image
acquisition processing of FIG. 4.

FIG. 6 is a diagram that shows an example of the display
in step S2 in FIG. 3.

FIG. 7 is a diagram that shows an example of the display
in step S7 in FIG. 3.

FIG. 8 is a diagram that explains an example of the
processing in step S11 in FIG. 3.

FIG. 9 is a diagram that explains an example of the
processing in step S13 in FIG. 3.

FIG. 10 is a diagram that explains an example of the
processing in step S14 in FIG. 3.

FIG. 11 is a flowchart that explains the party event
processing in step S14 in FIG. 3.

FIG. 12 is a diagram that explains party event processing.

FIG. 13 is a diagram that explains another example of
event processing.

FIG. 14 is a diagram that explains pyramid filter
processing.

FIG. 15 is a flowchart that explains pyramid filter
processing.

FIG. 16 is a diagram that explains the processing of steps
S62 and S64 in FIG. 15.

FIG. 17 is a diagram that explains inter-frame difference
processing.

FIG. 18 is a flowchart that explains inter-frame difference
processing.

BRIEF DESCRIPTION OF THE INVENTION

The image processing device has a first extraction means
(for example, step [illegible] in FIG. [illegible]) that extracts
from an input image the image of a prescribed part; a second
extraction means (for example, step S35 in FIG. 4) that
extracts as a higher-resolution image a part of the prescribed
part extracted by the first extraction means; and a tracking
means (for example, step [illegible] in FIG. [illegible]) that
tracks the image of the prescribed part so that the second
extraction means can continuously extract the image of the
prescribed part from the image extracted by the first
extraction means.

The image processing device also has a display control
means (for example, step S2 in FIG. 3) that causes the input
image to be displayed as an image whose left and right are
reversed.

The image processing device also has a display control
means (for example, step S9 in FIG. 3) that causes to be
displayed a prescribed image that is different from the input
image, and causes this image to be changed in correspondence
with the image extracted by the second extraction means.

FIG. 1 is a block diagram showing an example of the
composition of an image processing system to which this
invention is applied. As shown in this diagram, image
processing devices 1-1 through 1-3 are connected to server
3 via Internet 2. Image processing devices 1-1 through 1-3
(referred to hereafter simply as image processing device 1,
unless it is necessary to distinguish them individually;
likewise for the other devices as well) send to server 3 via
Internet 2 the current position of the mask image of the user
in his own virtual reality space as well as data including the
expression of the user mask image. Server 3 supplies to each
image processing device 1, via Internet 2, image data on the
mask images of users positioned near the position, in virtual
reality space, that is supplied from it.

FIG. 2 is a block diagram showing an example of the
composition of image processing device 1-1. Although not
pictured, image processing devices 1-2 and 1-3 have the same
composition as image processing device 1-1.

Connected to main CPU 31 via bus 34 are main memory
32 and image processing chip 33. Main CPU 31 generates
drawing commands and controls the operation of image
processing chip 33. Stored in main memory 32 as appropriate
are the programs and data needed for main CPU 31 to
execute various processing.

In response to drawing commands supplied from CPU 31,
rendering engine 41 of image processing chip 33 executes
operations that draw the prescribed image data to image
memory 43 via memory interface 42. Bus 45 is connected
between memory interface 42 and rendering engine 41, and
bus 46 is connected between memory interface 42 and image
memory 43. Bus 46 has a bit width of, for example, 128 bits,
and rendering engine 41 can execute drawing processing at
high speed to image memory 43. Rendering engine 41 has
the capacity to draw image data of 320x240 pixels of the
[illegible] system, for example, or image data of
640x480 pixels in real time ([illegible] to 1/60 second).

Image processing chip 33 has programmable CRT
controller (PCRTC) 44, and this PCRTC 44 has the function of
controlling in real time the position, size, and resolution of
the image data input from video camera 35. PCRTC 44
writes the image data input from video camera 35 into the
texture area of image memory 43 via memory interface 42.
Also, PCRTC 44 reads via memory interface 42 the image
data drawn in the drawing area of image memory 43, and
outputs it to and displays it on CRT 36. Image memory 43
has a unified memory structure that allows the texture area
and drawing area to be specified in the same area.

Audio processing chip 37 processes the audio data input
from microphone 38 and outputs it from communication unit
40 through Internet 2 to the other image processing device
1. Also, via communication unit 40, audio processing chip
37 processes the audio data supplied from the other image
processing device 1 and outputs it to speaker 39.
Communication unit 40 exchanges data between the other image
processing device 1 and server 3 via Internet 2. Input unit 30
is operated when the user inputs various commands.

According to the mode set by the blending mode setting
function Set-Mode (MODE) from CPU 31, rendering engine
41 causes blending processing to be done between
destination pixel value DP(X,Y) in the drawing area of image
memory 43 and texture area pixel value SP(X,Y).

The blending modes executed by rendering engine 41
include mode 0 through mode 3, and in each mode the
following blending is executed.

Mode 0: SP(X,Y)

Mode 1: DP(X,Y) + SP(X,Y)

Mode 2: DP(X,Y) - SP(X,Y)

Mode 3: (1 - αSP(X,Y)) * DP(X,Y) + αSP(X,Y) * SP(X,Y)

Here αSP(X,Y) represents the α value of the source pixel
value.

That is, in mode 0, the source pixel value is drawn to the
destination without modification; in mode 1, the source pixel
value is added to the destination pixel value and is drawn;
and in mode 2, the source pixel value is subtracted from the
destination pixel value and is drawn. And in mode 3, the
source pixel value and destination pixel value are composed
by assigning a weighting that corresponds to the α value of
the source.

The image data drawn to the drawing area of image
memory 43 is read out in PCRTC 44 via memory interface
42, and from there it is output to and displayed on CRT 36.

Next, the operation is described with reference to the
flowchart in FIG. 3. First, in step S1, active image
acquisition processing is executed. The details of part of this
active image acquisition processing are shown in the
flowchart in FIG. 4.

That is, first, in step S31, PCRTC 44 takes in
low-resolution image data of the entire screen from the image
input from video camera 35 and supplies it to and stores it
in image memory 43 via memory interface 42. In this way,
processing image 51 is stored in image area 50 of image
memory 43, as shown in FIG. 5.

Next, proceeding to step S32, main CPU 31 controls
PCRTC 44 and executes processing that extracts the head
part of the viewed object (user) from the image taken in
step S31. That is, as shown in FIG. 5, the image 52 of the head
part is extracted from processing image 51. In step S33,
main CPU 31 decides whether the head part can be extracted
from the image taken in step S31, and if it cannot, it returns
to step S31 and repeatedly executes the processing that
begins there.

If in step S33 it is decided that the head part can be
extracted, it proceeds to step S34, and main CPU 31 controls
PCRTC 44 and takes in at medium resolution the region
whose center is the head part extracted in step S32. That is,
as shown in FIG. 5, medium-resolution image 52 whose
center is the head part is taken in from low-resolution
processing image 51 taken in step S31, and is stored in the
image area of image memory 43.

Next, in step S35, main CPU 31 executes processing that
extracts the image of the two-eyes part from the
medium-resolution image whose center is the head part that
was taken in step S34. That is, it executes processing in
which the two-eyes part is extracted from medium-resolution
image 52 whose center is the head part in FIG. 5. In step
S36, it is decided whether the two-eyes part can be extracted,
and if it cannot be extracted, it returns to step S31, and the
processing that begins there is repeatedly executed.

If in step S36 it is decided that the two-eyes part can be
extracted, main CPU 31 controls PCRTC 44, it proceeds to
step S37, and processing is executed that takes in at high
resolution the region whose center is the two-eyes part. That
is, high-resolution image 53 whose center is the two-eyes
part is taken in from medium-resolution image 52 whose
center is the head part shown in FIG. 5, and is stored in
image area 50.

Next, in step S38, main CPU 31 executes processing in
which the two-eyes part from high-resolution image 53
taken in step S37 is extracted and its position is calculated.
In step S39, it is decided whether the two-eyes part can be
extracted, and if it cannot, it returns to step S34, and the
processing that begins there is repeatedly executed. If in step
S39 it is decided that the two-eyes part can be extracted, it
returns to step S37, and the processing that begins there is
repeatedly executed.

As described above, the prescribed part is extracted with
a higher-resolution image, and if the prescribed part cannot
be extracted, it returns to a lower-resolution processing step
and the processing that begins there is repeatedly executed;
thus even if the user moves relative to video camera 35, his
two-eyes part is automatically and reliably tracked and can be
taken in as an image.
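The fall-back behavior of steps S31 through S39 can be read as a small state machine. The following is only a sketch under stated assumptions: `capture`, `find_head`, and `find_eyes` are hypothetical callables supplied by the caller (the source does not name any such functions), a detector returns `None` when extraction fails, and `max_steps` is added solely so that the loop terminates in a demonstration.

```python
from typing import Callable, List, Optional, Tuple

Point = Tuple[int, int]


def track_eyes(capture: Callable[[str, Optional[Point]], object],
               find_head: Callable[[object], Optional[Point]],
               find_eyes: Callable[[object], Optional[Point]],
               max_steps: int = 100) -> List[Point]:
    """Return the eye positions calculated at high resolution (step S38).

    capture(level, center) is a hypothetical stand-in for PCRTC 44 taking in
    an image at "low", "medium", or "high" resolution around `center`.
    """
    positions: List[Point] = []
    state = "S31"
    head: Optional[Point] = None
    eyes: Optional[Point] = None
    for _ in range(max_steps):
        if state == "S31":
            low = capture("low", None)            # S31: whole screen, low resolution
            head = find_head(low)                 # S32: extract the head part
            state = "S34" if head is not None else "S31"    # S33
        elif state == "S34":
            medium = capture("medium", head)      # S34: region centered on the head
            eyes = find_eyes(medium)              # S35: extract the two-eyes part
            state = "S37" if eyes is not None else "S31"    # S36
        else:
            high = capture("high", eyes)          # S37: region centered on the eyes
            found = find_eyes(high)               # S38: extract and locate the eyes
            if found is not None:                 # S39: success, stay at high resolution
                eyes = found
                positions.append(found)
                state = "S37"
            else:                                 # S39: failure, drop to medium resolution
                state = "S34"
    return positions
```

Keeping the three resolution levels as explicit states makes the two different fall-back targets visible: a failure at medium resolution restarts from the low-resolution image, while a failure at high resolution only drops back to the medium-resolution image.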

In step S1 of FIG. 3, processing as above is executed that
includes the processing shown in FIG. 4, the two-eyes part
of the viewed object is automatically tracked, and if an
image of the viewed object (an image of the face) is 
obtained, then in step S2, main CPU 31 controls rendering 
engine 41, generates a left-right reverse image, and outputs 
it to and displays it on CRT 36. That is, in response to 
commands from main CPU 31, rendering engine 41 converts
the image of the user's face that was taken in step S1 to an
image in which its left and right are reversed (to its mirror 
image). This image in which left and right are reversed is 
output via PCRTC 44 to CRT 36 and is displayed as shown 
in FIG. 6. At this time, as shown in FIG. 6, main CPU 31
controls rendering engine 41, and displays line P1
superimposed on the two-eyes extracted region extracted in steps
S35, S37, and S38, allowing the user to recognize it.

If in steps S35, S37, and S38 extraction is done for the 
mouth as well, then line P2 is displayed around the extracted
region of the mouth, as shown in FIG. 6. 

If lines P1 and P2 are displayed in this way, then the user
will be able to recognize that the regions enclosed by these
lines P1 and P2 are being extracted and that tracking
operations are being carried out.
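The display of step S2 can be illustrated with a short sketch. It assumes an image stored as a list of rows of pixel values and an extracted region given as an inclusive box `(left, top, right, bottom)`; the function names and the rectangular outline used for lines P1 and P2 are illustrative assumptions, not details taken from the source.

```python
def mirror(image):
    """Return a left-right reversed copy of a row-major image."""
    return [list(reversed(row)) for row in image]


def mirror_box(box, width):
    """Map an inclusive box (left, top, right, bottom) into the mirrored image."""
    left, top, right, bottom = box
    return (width - 1 - right, top, width - 1 - left, bottom)


def draw_outline(image, box, value):
    """Draw a rectangular outline on a copy of the image (stand-in for line P1 or P2)."""
    left, top, right, bottom = box
    out = [row[:] for row in image]
    for x in range(left, right + 1):
        out[top][x] = value
        out[bottom][x] = value
    for y in range(top, bottom + 1):
        out[y][left] = value
        out[y][right] = value
    return out
```

Because the region is extracted from the camera image before it is reversed, the box has to be mirrored as well before the outline is drawn on the displayed image.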

Next, proceeding to step S3, the user looks at the display 
image of CRT 36 and decides whether it is necessary to 
make a positional adjustment of his own position relative to 
the position of video camera 35; if it is decided that it is 
necessary to make a positional adjustment, it proceeds to
step S4, and the position of video camera 35 or the position
of the user himself is appropriately adjusted. Then it returns
to step S1, and the processing that begins there is repeatedly
executed. 

If in step S3 it is decided that there is no need to adjust the
position of video camera 35 or of the user himself, it 
proceeds to step S5, and main CPU 31 issues action instruc- 
tions for extracting the features of the face. That is, main 
CPU 31 controls audio processing chip 37 and gives the 
user, through speaker 39, instructions to perform prescribed
actions, such as to turn his head, blink (wink), or open and 
close his mouth. Of course, these instructions may also be 
given by controlling rendering engine 41, drawing pre- 
scribed messages in image memory 43, and outputting these 
drawn messages to, and displaying them on, CRT 36 via
PCRTC 44. 

Next, proceeding to step S6, main CPU 31 extracts, as 
changes in the image, the changes of the operations per- 
formed by the user in response to the action instructions in 
step S5, and extracts the facial-features region. That is, after,
for example, an instruction to blink (wink) is given, the part 
in which a change occurs in the taken-in image is recognized 
as an eye. And after an instruction is given to open and close 
the mouth, the region of the image in which the change 
occurs is recognized as the mouth part.
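One way to picture step S6 is to compare a frame taken before an action instruction with a frame taken while the user performs it, and to label the region that changed with the name of the instruction. The sketch below assumes grayscale images stored as lists of rows; the function names, the threshold value, and the bounding-box representation are illustrative assumptions.

```python
def changed_region(before, after, threshold=16):
    """Return the inclusive box (left, top, right, bottom) of changed pixels, or None."""
    xs, ys = [], []
    for y, (row_b, row_a) in enumerate(zip(before, after)):
        for x, (b, a) in enumerate(zip(row_b, row_a)):
            if abs(a - b) > threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None  # nothing moved: extraction failed for this instruction
    return (min(xs), min(ys), max(xs), max(ys))


def label_features(frames_by_instruction, threshold=16):
    """Map each instruction name to the region that changed while it was performed.

    frames_by_instruction maps a name such as "blink" to a (before, after) pair.
    """
    return {name: changed_region(before, after, threshold)
            for name, (before, after) in frames_by_instruction.items()}
```

A result of `None` for an instruction corresponds to the situation, handled later in steps S10 through S13, in which the expected part could not be extracted.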

Next, proceeding to step S7, main CPU 31 generates a 
computer graphics image of a mask and controls rendering 
engine 41 to draw it superimposed on the display position of 
the image of the user's face. When this image is output to 
CRT 36 via PCRTC 44, an image is displayed on CRT 36 in
which the face part of the user's image is replaced by a 
mask. 

Next, proceeding to step S8, main CPU 31 outputs from 
speaker 39 or CRT 36 a message instructing the user to move
the facial-features region extracted in step S6 (for example,
the eyes, mouth, or eyebrows). That is, the user is asked, for 
example, to wink, open and close his mouth, or move his
eyebrows up and down. When the user winks, opens and
closes his mouth, or moves his eyebrows up and down in 
response to this request, the image thereof is taken in via 
video camera 35. In step S9, main CPU 31 detects the region 
that changes in response to the action instruction as the 
change of the region that corresponds to the instruction, and 
in response to the detection results, it changes the corre- 
sponding part of the mask displayed in step S7. That is, 
when the user blinks (winks) in response to an action 
instruction to blink (wink) and this is detected, main CPU 31 
causes the eyes of the mask to blink (wink). Similarly, when 
the user opens and closes his mouth or moves his eyebrows 
up and down, the mouth of the mask is opened and closed 
and the eyebrows of the mask are moved up and down 
correspondingly. 

Next, in step S10, the user decides whether the position 
has been extracted correctly. If, for example, the eye of the
mask does not wink even though the user winked in response 
to a wink action instruction, the user, by operating the 
prescribed key on input unit 30, informs main CPU 31 that 
the correct extraction has not been carried out. Then, in step
S11, main CPU 31 outputs a correction instruction. That is,
the user is instructed to remain stationary, and a message is 
output to remove something moving in the background that 
is thought to be the cause of the misjudgment, or to change 
the lighting, etc. If there is anything behind the user that is 
moving, in response to this message the user removes it, or 
modifies the lighting. In addition, main CPU 31 gives 
instructions to put on a headband as shown in FIG. 8, or to 
put on a cap. When the user puts on a headband or cap in 
accordance with this instruction, this can be taken as the 
standard to detect the head part. Thus in this case it returns 
to step S1, and the processing that begins there is repeatedly
executed. 

If in step S10 it is decided that the position has been 
extracted correctly, it proceeds to step S12, and it is decided 
whether the expression has been extracted correctly. That is, 
if for example, even though the user moves his cheek in 
response to an action instruction in step S8, the cheek of the 
mask displayed in step S9 does not change, then by oper- 
ating input unit 30 the user informs main CPU 31 that the 
expression extraction processing has not been successful. 
Then main CPU 31 outputs a correction instruction in step 
S13. For example, main CPU 31 instructs the user to put 
makeup on or mark the cheek part. If in response to this 
instruction the user puts makeup on or marks his cheeks, an 
image as shown in FIG. 9 will be taken in, so main CPU 31 
will be able to correctly extract the cheeks by taking this 
makeup or marking as a standard. Thus even in this case it 
returns from step S13 to S1, and the processing that begins
there is repeatedly executed. 

If in step S12 it is judged that the expression can be 
extracted correctly, an image of the user is obtained that has 
a mask whose expression changes in response to changes in 
the user's face, as shown in FIG. 10. In this case, it proceeds 
to step S14, and party event processing is executed. The
details of this party event processing are shown in FIG. 11.

First, in step S51, main CPU 31 generates the user's 
image, which has the mask generated in step S9, as the 
image of the virtual image of the user in virtual reality space, 
controls rendering engine 41, and draws it to image memory 
43. Next, in step S52, main CPU 31 reads the image data of 
the user's virtual image from image memory 43 and supplies 
it to communication unit 40. Then main CPU 31 further 
controls communication unit 40 and transmits this image 
data to server 3 via Internet 2. At this time, main CPU 31 
simultaneously also sends the position data corresponding to 
the position of the user mask image in the virtual reality
space provided by server 3, in response to operations from
input unit 30.

Then in step S53, main CPU 31 controls audio processing
chip 37 and causes user audio data input from microphone
38 to be transmitted from communication unit 40 through
Internet 2 to server 3.

When image data of the corresponding user mask image,
position data in the virtual reality space, and audio data are
input via Internet 2, for example from image processing
device 1-1, server 3 supplies this data to the image
processing device 1 (for example image processing device 1-2
and image processing device 1-3) positioned near its position
and for which the user mask image corresponds. Similarly,
when the user mask image data, position data, and audio data
are transferred via Internet 2 from image processing devices
1-2 and 1-3, server 3 outputs this data to image processing
device 1-1 via Internet 2.

When thus the user mask image data, its position data, and
audio data are transferred from the other image processing
devices 1-2 and 1-3, main CPU 31 of image processing
device 1-1 receives this in step S54. And main CPU 31
controls rendering engine 41 and draws to image memory 43
the image of the corresponding user mask image in the
corresponding position on the image in virtual reality space.
Then this drawn image data is read by PCRTC 44 and is
output to and displayed on CRT 36. Also, main CPU 31
outputs the transmitted audio data to audio processing chip
37, causes audio processing to be done on it, then causes it
to be output from speaker 39.

As described above, for example as shown in FIG. 12, the
user mask images 61-2 and 61-4 of other users (in the
display example of FIG. 12, users B and D) are displayed on
CRT 36-1 of image processing device 1-1, which is used by
user A. And appearing on CRT 36-2 of image processing
device 1-2 of user B are user mask image 61-1 of user A and
user mask image 61-3 of user C. When user A talks, this is
taken in by microphone 38 and is played for example from
speaker 39 of image processing device 1-2 of user B. And
because at this time the image of user A is taken in by video
camera 35-1, the mouth of user mask image 61-1 of user A
changes corresponding to the mouth of user A. Similarly,
when user B changes his facial expression, this is taken in
by his video camera 35-2, and the facial expressions of user
virtual image 61-2 of user B change.

The above-described processing is repeatedly executed
until a termination instruction is given in step S55.

In the above, a virtual party is enjoyed in a virtual reality
space with the mask images of many users via server 3, but
it is also possible to enjoy a one-on-one virtual party
between user A and user B, as shown in FIG. 13.

If the head part is extracted in the processing in step S32 of
FIG. 4, it is possible to extract the head part by taking as the
standard, for example, the color of the hair. In this case,
pyramid filter processing can be employed. When pyramid
filter processing is done, the average value of the pixel
values is calculated, so the region in which this average
value is close to the pixel value of the color of hair can be
extracted as the hair region.

Next, we explain pyramid filter processing. In this
pyramid filter processing, as shown in FIG. 14, processing is
repeated in which one determines the average value of four
mutually adjacent pixel values of the processing image, and
arranges the pixel in the center of the four pixels. That is,
when processing is executed in which the average pixel
value of four nearby points is calculated by bilinear
interpolation, image data of (n/2)x(n/2) is obtained from a
processing image of nxn (where n is a power of 2). When
this processing is executed repeatedly, ultimately the data of
the one pixel at the apex of the pyramid becomes pixel data
that represents the average value of all the pixels at the base
of the pyramid.

If such pyramid processing is to be done, main CPU 31
outputs the following drawing commands to rendering
engine 41.

    int L;       /* length of a side of the source area */
    int offset;

    L = 1 << N;  /* 2 to the power N: length of one side of the initial image */
    offset = 0;
    while (L > 1) {
        Set_Texture_Base(0, offset);   /* set basepoint of texture area */
        offset += L;
        Set_Drawing_Base(0, offset);   /* set basepoint of drawing area */
        Flat_Texture_Rectangle(0, 0, L/2, 0, L/2, L/2, 0, L/2,
                               0.5, 0.5, L+0.5, 0.5, L+0.5, L+0.5, 0.5, L+0.5,
                               1.0);
        L = L / 2;
    }

When these drawing commands are expressed in flowchart
form, we get what is shown in FIG. 15. First, in step
S61, the variable "offset" is initialized to 0. Next, in step
S62, processing is executed in which the basepoint of texture
area 51 is set to (0,offset). That is, as shown in FIG. 16,
basepoint T(0,0) is set. Next, proceeding to step S63, the
variable "offset" is incremented by L. Then, in step S64,
the basepoint of drawing area 52 is set to (0,offset). In this
case, as shown in FIG. 16, basepoint D(0,L) is set.

Next, in step S65, processing is executed in which
drawing is done by multiplying the pixel values of quadrilateral
(0.5,0.5,L+0.5,0.5,L+0.5,L+0.5,0.5,L+0.5) of the source
(texture area) by 1 and adding it to quadrilateral
(0,0,L/2,0,L/2,L/2,0,L/2) of the destination. That is, in this
way one obtains from the lowermost processing image (on the
base of the pyramid) shown in FIG. 14 the processing image
of one layer higher.

Next, proceeding to step S66, the variable L is set to 1/2 its
current value. In step S67, it is decided whether variable L
is greater than 1; if variable L is greater than 1, it returns to
step S62, and the processing that begins there is repeatedly
executed. That is, in this way, the image data of the third
layer is obtained from the second layer.

Thereafter, similar processing is repeatedly executed, and
if in step S67 it is decided that variable L is not greater than
1 (if it is decided that variable L is equal to 1), the pyramid
filter processing terminates.

If the facial-features region is extracted in step S6 in
response to action instructions in step S5 of FIG. 3, the
region that changes in response to the action instructions
(the moving part) can be extracted by performing
inter-frame difference processing.

Next, we explain inter-frame difference processing. In this
inter-frame difference processing, the difference between the
image of a frame at time t and the image of the frame at time
t+1 is calculated as shown in FIG. 17. In this way, the area
of an image in which there is movement can be extracted.

That is, in this case, main CPU 31 causes rendering engine
41 to execute processing as shown in the flowchart in FIG.
18. First, in step S81, rendering engine 41, in response to
instructions from main CPU 31, sets mode 2 as blending
mode. Next, in step S82, among the image data input from
video camera 35, rendering engine 41 takes the image data
of a temporally later frame as a destination image and takes 
the image data of a temporally previous frame as source 
image data. Then, in step S83, rendering engine 41 executes
processing in which drawing is done by subtracting the pixel
value of the source quadrilateral from the pixel value of the 
destination quadrilateral. The pixel data of a frame in the 
destination area and the image data of a frame in the source 
area have essentially an equal value in the still-picture 
region. As a result, when the processing in step S83 is
executed, the value of the image data is roughly zero.

By contrast, the value of the image data in a region where 
there is movement is different depending on whether it is in 
the destination or in the source. Therefore the value of the 
image data obtained as a result of the processing in step S83
becomes a value that has a prescribed size other than zero.
Thus one can distinguish whether it is a moving region or a 
still region from the size of the value of each pixel data of 
the image data of the inter-frame difference. 
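The four blending modes and the mode 2 subtraction used in steps S81 through S83 can be written out as a short sketch. It assumes single-channel pixel values held in lists of rows, keeps signed results instead of modeling any clamping the rendering engine may apply, and uses an arbitrary threshold; the function names are illustrative and do not come from the source.

```python
def blend(mode, dp, sp, alpha=1.0):
    """Blend destination pixel dp with source pixel sp according to modes 0 to 3."""
    if mode == 0:
        return sp                                # source drawn without modification
    if mode == 1:
        return dp + sp                           # source added to destination
    if mode == 2:
        return dp - sp                           # source subtracted from destination
    if mode == 3:
        return (1 - alpha) * dp + alpha * sp     # weighted by the source alpha value
    raise ValueError("unknown blending mode")


def frame_difference(later, earlier):
    """Apply mode 2 per pixel: later frame as destination, earlier frame as source."""
    return [[blend(2, d, s) for d, s in zip(row_d, row_s)]
            for row_d, row_s in zip(later, earlier)]


def moving_mask(diff, threshold=16):
    """Mark as moving every pixel whose difference magnitude exceeds the threshold."""
    return [[abs(v) > threshold for v in row] for row in diff]
```

Pixels in a still region give a difference near zero and are marked `False`, while pixels in a moving region give a clearly nonzero difference and are marked `True`.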

In this specification, "system" means the whole of a
device that consists of multiple devices. 
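Returning to the pyramid filter processing of FIG. 14 and FIG. 15, the repeated halving can be sketched on the CPU as follows. This is not the rendering-engine procedure itself: the bilinear texture sampling at half-pixel offsets is replaced here by a plain average of each 2x2 block, which is assumed to give the same result, and the function names are illustrative.

```python
def halve(image):
    """Average each 2x2 block of a square image with an even side length."""
    half = len(image) // 2
    return [[(image[2 * y][2 * x] + image[2 * y][2 * x + 1]
              + image[2 * y + 1][2 * x] + image[2 * y + 1][2 * x + 1]) / 4
             for x in range(half)]
            for y in range(half)]


def pyramid(image):
    """Return every layer from the n x n base to the 1 x 1 apex (n a power of 2)."""
    side = len(image)
    if side == 0 or side & (side - 1) or any(len(row) != side for row in image):
        raise ValueError("image must be square with a power-of-2 side")
    layers = [image]
    while side > 1:              # corresponds to the test in step S67
        image = halve(image)     # corresponds to the drawing in step S65
        layers.append(image)
        side //= 2               # corresponds to step S66
    return layers
```

The single value in the last layer is the average of all base pixels, which is the quantity compared against the hair color when the head part is extracted.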

As the distribution medium for providing the user with a 
computer program that performs the above-described
processing, one can employ either a recording medium such
as magnetic disk, CD-ROM, or solid memory, or a communication
medium such as network or satellite.

As described above, with the image processing device, the 
image processing method, and the distribution medium of 
the present invention, an image once extracted is extracted 
as an even higher-resolution image, making it possible to
accurately track a prescribed part.

What is claimed is: 

1. An image processing device comprising: 

a first extraction means for extracting an image of a 
prescribed part from an input image;

a second extraction means for extracting a portion of said 
image of said prescribed part extracted by said first 
extraction means as a higher-resolution image; 

a tracking means for tracking the image of said prescribed
part so that said second extraction means can continu- 
ously extract the portion of said image of said pre- 
scribed part from the image; 

a display means for displaying the input image; 

a rendering means for rendering a virtual image that is
different from the input image, said virtual image 
including a feature that corresponds to the portion of 
said image of said prescribed part; and 

a display control means that causes the virtual image to be 
displayed on the display means by superimposing the
virtual image on the input image, and causes the feature 
that corresponds to the portion of said image of said 
prescribed part to change in correspondence with a 
change in the portion of said image of said prescribed
part as it is continuously extracted.

2. The image processing device of claim 1, wherein said 
rendering means renders the input image for display as an 
image whose left and right are reversed. 

3. The image processing device of claim 1 wherein the 
input image is an image that is output from a video camera.

4. The image processing device of claim 3 wherein the 
prescribed part that the first extraction means extracts from 
an output image is the eyes or mouth of a user imaged by 
video camera.

5. The image processing device of claim 1 wherein the
first extraction means performs processing in which the 
prescribed part of the image is extracted from the input 
image by pyramid filter processing. 

6. The image processing device of claim 1 wherein the 
first extraction means performs processing in which the 
prescribed part of the image is extracted from the input 
image by inter-frame difference processing. 

7. The image processing device of claim 1, wherein the 
input image comprises a face, and the virtual image com- 
prises a mask. 

8. An image processing method comprising: 

a first extraction step for extracting an image of a pre- 
scribed part from an input image; 

a second extraction step for extracting a portion of said 
image of said prescribed part extracted in the first 
extraction step as a higher-resolution image; 

a tracking step for tracking the image of said prescribed 
part so that in the second extraction step the portion of 
said image of said prescribed part can be continuously 
extracted from the image extracted in the first extrac- 
tion step; 

a rendering step for rendering a virtual image that is 
different from the input image, said virtual image 
including a feature that corresponds to the portion of 
said image of said prescribed part; and 

a display control step that causes the virtual image to be 
displayed on a display by superimposing the virtual 
image on the input image, and causes the feature that 
corresponds to the portion of said image of said pre- 
scribed part to change in correspondence with a change 
in the portion of said image of said prescribed part as 
it is continuously extracted. 

9. A distribution medium comprising a program that 
causes processing to be executed on an image device includ- 
ing: 

a first extraction step for extracting an image of a pre- 
scribed part from an input image; 

a second extraction step for extracting a portion of said
image of said prescribed part extracted in the first 
extraction step as a higher-resolution image; 

a tracking step for tracking the image of said prescribed 
part so that in the second extraction step the portion of 
said image of said prescribed part can be continuously 
extracted from the image extracted in the first extrac- 
tion step; 

a rendering step for rendering a virtual image that is 
different from the input image, said virtual image 
including a feature that corresponds to the portion of 
said image of said prescribed part; and 

a display control step that causes the virtual image to be 
displayed on a display by superimposing the virtual 
image on the input image, and causes the feature that 
corresponds to the portion of said image of said pre- 
scribed part to change in correspondence with a change 
in the portion of said image of said prescribed part as 
it is continuously extracted. 

10. The distribution medium according to claim 9 wherein 
the program is provided with in addition to a recording 
medium such as magnetic disk, CD-ROM, or solid memory. 